Quantitative Analysis

Fall 2023 & Spring 2024 survey data

Primary modeling

We had two primary outcomes of interest: CIS and PIO4. CIS scores measure STEM career interest, and PIO4 measures STEM identity. For each outcome, we fit a two-level cross-classified hierarchical linear model (HLM) with Level 1 individuals (\(i = 1, 2, \dots, n_j\)) nested within Level 2 classes (\(j = 1, 2, \dots, J\)). The data structure is cross-classified because the same student may belong to more than one class (e.g., take BIOL 151 and CHEM 151 in the same semester). Each model had the following form:

Level 1: students

\[\text{postscore}_{ij} = \beta_0 + \beta_1 (\text{prescore}_{ij} - \overline{\text{prescore}}_j) + \beta_2X_{2_{ij}} + \cdots + \beta_pX_{p_{ij}} + \epsilon_{ij} \]

Level 2: classes

\[\beta_0 = \gamma_{00} + \gamma_{01}\text{LA}_j + \gamma_{02}\text{GE}_j + \gamma_{03}\text{LA*GE}_j + \gamma_{04}\overline{\text{prescore}}_j + \gamma_{05}W_{5_j} + \cdots + \gamma_{0q}W_{q_j} + u_{j}\]

\[\beta_1 = \gamma_{10}\] \[\vdots\] \[\beta_p = \gamma_{p0}\]

Combined:

\[\text{postscore}_{ij} = \gamma_{00} + \gamma_{01}\text{LA}_j + \gamma_{02}\text{GE}_j + \gamma_{03}\text{LA*GE}_j + \gamma_{04}\overline{\text{prescore}}_j + \gamma_{05}W_{5_j} + \cdots + \gamma_{0q}W_{q_j} + \gamma_{10} (\text{prescore}_{ij} - \overline{\text{prescore}}_j) + \gamma_{20}X_{2_{ij}} + \cdots + \gamma_{p0}X_{p_{ij}} + u_j + \epsilon_{ij}\]

Note the following specifications of the model:

  • prescores are group mean centered at level 1, and the group mean is also included as a level-2 predictor. This allows for both within- and between-group variation in prescores to be accounted for in the model.
  • \(X\) denotes a generic level-1 predictor
  • \(W\) denotes a generic level-2 predictor
  • \(p\) denotes the number of level-1 predictors
  • \(q\) denotes the number of level-2 predictors
  • the treatment variable of primary interest is the level-2 variable \(LA\), which is an indicator variable for whether the class had an LA or not
  • the treatment variable is interacted with GE, a level-2 variable indicating whether the class was a General Education course or a “STEM Major Gateway” course. We expect the outcomes (STEM career interest & identity) to behave differently in these course types, and GE was also used as a stratifying variable in the research plan when assigning treatment (LA) and control (no LA) status to each course. Therefore, the GE variable and its interaction with LA are included in every model, regardless of statistical significance.
  • \(\epsilon_{ij}\) is the level-1 (individual) error term, assumed to be normally distributed
  • \(u_j\) is the level-2 (class) error term, assumed to be normally distributed
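The group-mean centering described in the first bullet above can be sketched as follows. This is a minimal illustration only, not the code used for the analysis: the rows, class labels, and the column names `prescore_grp_avg` and `prescore_grp_cntr` are hypothetical, chosen to mirror the naming pattern of the model terms.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical long-format data: one row per student-class enrollment.
# Student "s2" appears in two classes, which is what makes the
# structure cross-classified rather than strictly nested.
rows = [
    {"student": "s1", "class_id": "BIOL151-01", "prescore": 40.0},
    {"student": "s2", "class_id": "BIOL151-01", "prescore": 50.0},
    {"student": "s2", "class_id": "CHEM151-01", "prescore": 50.0},
    {"student": "s3", "class_id": "CHEM151-01", "prescore": 30.0},
]


def add_group_centering(rows, group_key="class_id", score_key="prescore"):
    """Return copies of rows with the class mean and the within-class deviation."""
    scores_by_group = defaultdict(list)
    for r in rows:
        scores_by_group[r[group_key]].append(r[score_key])
    group_avg = {g: mean(v) for g, v in scores_by_group.items()}

    out = []
    for r in rows:
        avg = group_avg[r[group_key]]
        out.append({
            **r,
            "prescore_grp_avg": avg,                   # level-2 predictor
            "prescore_grp_cntr": r[score_key] - avg,   # level-1 predictor
        })
    return out


centered = add_group_centering(rows)
for r in centered:
    print(r["student"], r["class_id"], r["prescore_grp_avg"], r["prescore_grp_cntr"])
```

Because the centered scores sum to zero within each class, the level-1 term carries only within-class variation, while the class mean carries the between-class variation.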

Fixed-effect estimates, CIS model:

term                        estimate  std.error   statistic         df    p.value
(Intercept)               10.3637091  7.0701381   1.4658425  190.46597  0.1443401
score_CIS_PRE_grp_cntr     0.6558413  0.0560807  11.6945998   63.20520  0.0000000
score_CIS_PRE_grp_avg      0.7572481  0.1568356   4.8282928  194.04150  0.0000028
ethnicity_collapsedAsian   2.0878239  0.8383434   2.4904160  115.79179  0.0141779
ethnicity_collapsedWhite   0.8603214  0.6383994   1.3476225  111.17314  0.1805197
ethnicity_collapsedOther   1.6770966  0.9940697   1.6871016   43.12047  0.0988062
stem_majorYes              1.4514459  0.6648323   2.1831761   94.82581  0.0314911
prop_stem_major_grd_cntr  -4.9043843  2.8682052  -1.7099140  275.40758  0.0884081
grade_ABCB                -0.8525435  0.5410869  -1.5756130  175.02184  0.1169208
grade_ABCC                -1.4604493  0.7377179  -1.9796854  119.95977  0.0500287
grade_ABCD                -3.4660689  0.9044460  -3.8322565  208.23528  0.0001681
grade_ABCF                -3.4636456  1.0556696  -3.2809940  174.01301  0.0012498
grade_ABCW                -2.4961548  1.0975515  -2.2742940  104.59269  0.0249896
subjectCHEM                2.8081829  1.4369522   1.9542633  257.94937  0.0517505
subjectMATH                3.0446094  1.4840378   2.0515713  256.80153  0.0412256
subjectPHYC                2.1204187  1.4075154   1.5064977  283.64272  0.1330523
LA_classyes               -1.2417809  1.2928871  -0.9604713  269.60192  0.3376787
gate_geGE                 -4.5358487  2.2940811  -1.9771963  221.10967  0.0492618
LA_classyes:gate_geGE      0.3305614  2.1737311   0.1520710  244.46153  0.8792564
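
As a quick sanity check on the table above, the sketch below recomputes each test statistic as estimate divided by standard error and lists the terms whose p-value falls below 0.05. The values are copied from the table. The 0.05 threshold is an assumption (the report does not state one explicitly), and the names `coef_table`, `ALPHA`, `max_gap`, and `significant` are illustrative.

```python
# (term, estimate, std_error, statistic, p_value), copied from the table above
coef_table = [
    ("(Intercept)", 10.3637091, 7.0701381, 1.4658425, 0.1443401),
    ("score_CIS_PRE_grp_cntr", 0.6558413, 0.0560807, 11.6945998, 0.0000000),
    ("score_CIS_PRE_grp_avg", 0.7572481, 0.1568356, 4.8282928, 0.0000028),
    ("ethnicity_collapsedAsian", 2.0878239, 0.8383434, 2.4904160, 0.0141779),
    ("ethnicity_collapsedWhite", 0.8603214, 0.6383994, 1.3476225, 0.1805197),
    ("ethnicity_collapsedOther", 1.6770966, 0.9940697, 1.6871016, 0.0988062),
    ("stem_majorYes", 1.4514459, 0.6648323, 2.1831761, 0.0314911),
    ("prop_stem_major_grd_cntr", -4.9043843, 2.8682052, -1.7099140, 0.0884081),
    ("grade_ABCB", -0.8525435, 0.5410869, -1.5756130, 0.1169208),
    ("grade_ABCC", -1.4604493, 0.7377179, -1.9796854, 0.0500287),
    ("grade_ABCD", -3.4660689, 0.9044460, -3.8322565, 0.0001681),
    ("grade_ABCF", -3.4636456, 1.0556696, -3.2809940, 0.0012498),
    ("grade_ABCW", -2.4961548, 1.0975515, -2.2742940, 0.0249896),
    ("subjectCHEM", 2.8081829, 1.4369522, 1.9542633, 0.0517505),
    ("subjectMATH", 3.0446094, 1.4840378, 2.0515713, 0.0412256),
    ("subjectPHYC", 2.1204187, 1.4075154, 1.5064977, 0.1330523),
    ("LA_classyes", -1.2417809, 1.2928871, -0.9604713, 0.3376787),
    ("gate_geGE", -4.5358487, 2.2940811, -1.9771963, 0.0492618),
    ("LA_classyes:gate_geGE", 0.3305614, 2.1737311, 0.1520710, 0.8792564),
]

ALPHA = 0.05  # assumed significance threshold, not stated in the report

# Largest gap between the reported statistic and estimate / std_error;
# it should be tiny, reflecting only rounding in the printed values.
max_gap = max(abs(est / se - stat) for _, est, se, stat, _ in coef_table)

# Terms whose reported p-value is below the assumed threshold
significant = [term for term, _, _, _, p in coef_table if p < ALPHA]

print(f"largest statistic discrepancy: {max_gap:.2e}")
for term in significant:
    print(term)
```

Note that grade_ABCC (p = 0.0500287) and subjectCHEM (p = 0.0517505) sit just above the assumed threshold, so conclusions about those two terms are sensitive to the choice of cutoff.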

Interpretations:

  • The intercept represents the average end-of-semester CIS score for a very specific (possibly non-existent) respondent: a Hispanic non-STEM major who received an A in a non-LA STEM Gateway BIOL course, whose pre-score was average relative to other students in the course, and whose course’s pre-score average was average relative to all other courses.

  • The positive coefficient for ethnicity_collapsedAsian suggests that on average, an Asian student will have a higher post-CIS score (+2.09) compared to a Hispanic student in the same course who received the same grade and had the same pre-score and STEM major status

  • The positive coefficient for stem_majorYes suggests that on average, a STEM major will have a higher post-CIS score (+1.45) compared to a non-STEM major of the same ethnicity in the same course who received the same grade and had the same pre-score

  • The negative coefficient for grade_ABCD suggests that on average, a student who receives a D in their 100-level STEM course will have a lower post-CIS score (–3.47) compared to a student who receives an A in the same course, even when controlling for pre-score, ethnicity, STEM major status, and course characteristics

  • The negative coefficient for grade_ABCF suggests that on average, a student who receives an F in their 100-level STEM course will have a lower post-CIS score (–3.46) compared to a student who receives an A in the same course, even when controlling for pre-score, ethnicity, STEM major status, and course characteristics

  • The negative coefficient for grade_ABCW suggests that on average, a student who withdraws from their 100-level STEM course will have a lower post-CIS score (–2.50) compared to a student who receives an A in the same course, even when controlling for pre-score, ethnicity, STEM major status, and course characteristics

  • The positive coefficients for subjectCHEM, subjectMATH, and subjectPHYC suggest that on average, a student in a Chemistry, Math, or Physics course, respectively, will have a higher post-CIS score (+2.81, +3.04, and +2.12) compared to a similar student (same pre-score, ethnicity, and course grade) in a similar Biology course (same pre-score average, proportion of STEM majors in the course, LA status, and GE/Gateway status). Note, however, that only the difference between Math and Biology is statistically significant.

  • The negative coefficient for gate_geGE suggests that on average, a student in a 100-level STEM GE course will have a lower post-CIS score (–4.54) compared to a similar student (same pre-score, ethnicity, and course grade) in a similar 100-level STEM Gateway course (same pre-score average, proportion of STEM majors in the course, LA status, and subject (BIOL/CHEM/MATH/PHYC))

  • Pre-scores are strong predictors of post-scores, as is to be expected. The remaining variables, those not discussed above, are not statistically significant at the 0.05 level, including the treatment indicator for whether or not there was an LA in the class (p = 0.34) and its interaction with GE (p = 0.88). That is, there is not sufficient evidence to suggest that having an LA in a course improves student STEM career interest (in fact, the LA coefficient is negative, albeit non-significant).
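
Because LA is interacted with GE, the LA_classyes coefficient on its own is the estimated LA effect in Gateway courses (the reference level of gate_ge), and the estimated LA effect in GE courses adds the interaction coefficient. As a worked illustration using the point estimates from the table above (rounded to two decimals):

\[\widehat{\text{LA effect}}_{\text{Gateway}} = \hat\gamma_{01} = -1.24\]

\[\widehat{\text{LA effect}}_{\text{GE}} = \hat\gamma_{01} + \hat\gamma_{03} = -1.24 + 0.33 = -0.91\]

Both point estimates are negative. A standard error for the GE-course effect cannot be recovered from the table alone, since it depends on the covariance between \(\hat\gamma_{01}\) and \(\hat\gamma_{03}\), which is not reported here.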

CIS-item analysis

Overall CIS POST scores

by hispanic

Hispanic respondents are less likely than non-Hispanic respondents to agree overall that they are able to get a good grade, though they are about equally likely to say they will work hard and are able to complete their homework.

They are also less likely to agree overall that their parents would be happy if they chose a science/math career, that they like their science/math classes, that they have a role model in a science/math career, and that they would feel comfortable talking with someone in a math/science career.

Hispanic respondents are also less likely to strongly agree that they plan to use science/math in their future career, that they are interested in science/math careers, that doing well in their science/math classes will help them in their future career, and that they know someone in their family who uses science/math in their career. However, about the same proportions of Hispanic and non-Hispanic respondents agree overall with these four statements.

by sex

by STEM_ID

by honors

by first_gen

by ethnicity

by intl_stdnt

by disability

PIO-4 item analysis

Overall PIO4 POST scores

by hispanic

by sex

by STEM_ID

by honors

by first_gen

by ethnicity

by intl_stdnt

by disability

Table of item means by demographics

Note these are the same values as shown in the visualizations above.